Our exploratory data analysis examines patterns that inform both research questions about usage context and feature adoption. We organize our exploration into four main categories:
Figure 5: Website Usage Analysis - Distribution of blocked websites by category (top) and frequency of individual websites (bottom)
The analysis of blocked websites reveals distinct patterns in how users interact with the Jargon extension. Professional tools—particularly Salesforce and AI platforms—are the most frequently blocked, suggesting that users tend to avoid using Jargon during work-related activities. The presence of development environment blocks indicates that some users are technical professionals, though this group represents only a modest portion of the overall user base. Educational content also features prominently among blocked websites, with users often disabling the extension on documentation sites and learning platforms, possibly to maintain focus during concentrated study sessions.
However, these observations rest on only 27 blocked sites across 92 users. The blocking feature is evidently not widely used, so the patterns above should be treated as tentative and may not represent the broader user population.
Figure 6: Scatter plot showing the relationship between user adoption and question generation across different language modes
The scatter plot highlights key patterns in language mode usage:
- Spanish is the most active mode, with the highest number of questions (~800) and users (~30).
- GlizzyTalk and Tamil show moderate engagement (~300 questions each).
- Korean and GRE Vocabulary form a middle tier (~200 questions).
- Most other languages have low adoption, with fewer users and questions.
- Some modes (e.g., Tamil) have high question counts despite fewer users, indicating intensive use by dedicated learners.
Overall, while usage intensity and adoption vary widely across languages, traditional language learning modes drive most activity.
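One way to make the distinction between adoption (users) and intensity (questions per user) concrete is to compute a questions-per-user ratio for each mode. The following is a minimal sketch, not the analysis code behind Figure 6: the `ModeUsage` record, the `questions_per_user` helper, and the placeholder rows are all assumptions made for illustration.

```python
from typing import NamedTuple


class ModeUsage(NamedTuple):
    """Hypothetical record: one row per language mode."""
    mode: str
    users: int
    questions: int


def questions_per_user(rows):
    """Return (mode, questions per user) pairs, most intensive mode first.

    Modes with zero users are skipped to avoid division by zero.
    """
    ratios = ((r.mode, r.questions / r.users) for r in rows if r.users > 0)
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder rows for illustration only; these are NOT the study's data.
    example = [
        ModeUsage("mode_a", users=30, questions=800),  # broad adoption
        ModeUsage("mode_b", users=5, questions=300),   # few but intensive users
    ]
    for mode, ratio in questions_per_user(example):
        print(f"{mode}: {ratio:.1f} questions per user")
```

Under this measure, a mode with few users can rank above the most widely adopted mode, which is the pattern the scatter plot suggests for modes with dedicated learners.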
Figure 7: Word frequency analysis showing common words (top) and word pairs (bottom) in learning content.
Insights from Word and Phrase Frequency Analysis (based on the English original sentences selected for content generation):
Overall, the word frequency analysis reveals that users are engaging most with scientific and descriptive content, focusing on process-oriented vocabulary and recurring technical terms.
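Word and word-pair counts of the kind shown in Figure 7 can be produced with a simple tokenizer and a counter. The sketch below is an assumption about how such counts might be computed, not the pipeline actually used: the stopword list, the tokenizing regex, and the function names are illustrative.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the list used in the study).
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "is", "are", "that", "it"}
)


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def word_and_pair_counts(sentences):
    """Count single words and adjacent word pairs across all sentences."""
    words = Counter()
    pairs = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        words.update(tokens)
        # Pairs are formed after stopword removal, so they can span a dropped word.
        pairs.update(zip(tokens, tokens[1:]))
    return words, pairs
```

Calling `words.most_common(20)` and `pairs.most_common(20)` on the returned counters gives ranked lists suitable for bar charts. Forming pairs after stopword removal is a design choice: it surfaces content pairs such as a noun and its modifier, at the cost of pairing words that were not strictly adjacent in the original sentence.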
Figure 8: Daily activity patterns showing question generation and active users with their respective averages (red dashed lines) over the observation period, based on UTC timezone.
Figure 9: Weekly activity patterns showing average questions generated and active users by day of week (UTC timezone), with error bars indicating standard error and overall averages shown as red dashed lines.
The temporal analysis reveals several key patterns in user engagement, based on both daily and weekly activity (all timestamps in UTC):
Daily Trends: Question generation and active user counts fluctuate considerably day-to-day, with occasional spikes (up to 200 questions or 12 users), but most days remain below the average (12.5 questions, 2.2 users). This indicates a small but steady user base, with 1–5 active users on most days.
Weekly Trends: Question generation is highest on Mondays, Tuesdays, and Wednesdays, then tapers off toward the weekend, suggesting users are more engaged during the workweek. There is substantial variability across days, as shown by the error bars.
Together, these patterns indicate that Jargon’s usage is characterized by low but regular engagement, with activity peaking in the first half of the workweek and significant day-to-day variability. This suggests a core group of users who interact with the platform most during the workweek.
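The weekday averages and standard-error bars described for Figure 9 can be derived from per-day totals. The following sketch assumes a hypothetical input, `daily_counts`, that maps each UTC calendar date to a count (questions or active users); the function name and input shape are illustrative and not taken from the project's code.

```python
from collections import defaultdict
from datetime import date
from math import sqrt
from statistics import mean, stdev

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_summary(daily_counts):
    """Return {weekday name: (mean, standard error)} from a date -> count mapping.

    Standard error is the sample standard deviation divided by sqrt(n).
    A weekday observed only once gets a standard error of 0.0.
    """
    by_day = defaultdict(list)
    for day, count in daily_counts.items():
        by_day[WEEKDAYS[day.weekday()]].append(count)

    summary = {}
    for name in WEEKDAYS:
        values = by_day.get(name, [])
        if not values:
            continue  # no observations for this weekday
        se = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        summary[name] = (mean(values), se)
    return summary
```

One caveat with this approach: days with no activity need to appear in `daily_counts` with a count of 0. If they are simply absent, the weekday means are computed over active days only and will overstate typical usage, which matters for a user base this small.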
Figure 10: Distribution of key engagement metrics across users, showing individual violin plots for each metric with median and interquartile range (IQR) statistics. Each plot uses a distinct color and includes summary statistics.
The violin plots provide a clearer view of the distribution of user engagement metrics:
Overall, the violin plots highlight that engagement is highly skewed: most users interact minimally, while a small subset are much more active or exploratory. This pattern is consistent across all four metrics.
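The median and IQR statistics reported alongside each violin plot can be reproduced with the standard library. This is a minimal sketch under stated assumptions: `summarize` is a hypothetical helper, and the "inclusive" quantile method is one of several reasonable conventions, so the values may differ slightly from those produced by a plotting library.

```python
from statistics import mean, median, quantiles


def summarize(values):
    """Return median, quartiles, IQR, and mean for one engagement metric.

    Requires at least two observations. A mean well above the median
    is a simple sign of right skew.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "mean": mean(values),
    }
```

Applying `summarize` to each per-user metric in turn gives a compact table to read against the plots. When a few highly active users dominate, the mean is pulled far above the median while the IQR stays narrow, which matches the skewed pattern described above.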